AITopics

Country:

North America (0.46)
Asia (0.28)

Industry: Education > Assessment & Standards > Student Performance (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

FOX News, Mar-26-2025, 09:00:37 GMT

Amazon's AI-generated summary of popular conservative book accuses it of 'extreme' rhetoric

Markowicz previously explained why they wrote the book in a Fox News Digital opinion piece, noting that in 2021, then-Democratic Virginia gubernatorial candidate Terry McAuliffe said, "I don't think parents should be telling schools what they should teach." "Taken on its own, the comment might even be benign. Sure, parental involvement in education had always been a prediction of student success. A 2010 study called 'Parent Involvement and Student Academic Performance: A Multiple Mediational Analysis' by researchers at the Warren Alpert Medical School of Brown University and the University of North Carolina at Greensboro found 'children whose parents are more involved in their education have higher levels of academic performance than children whose parents are involved to a lesser degree.'" But should parents be designing a curriculum?

amazon, artificial intelligence, fox new digital

Country:

North America > United States > Virginia (0.25)
North America > United States > North Carolina (0.25)

Industry:

Education > Educational Setting > Higher Education (0.56)
Education > Assessment & Standards > Student Performance (0.36)

Technology: Information Technology > Artificial Intelligence (0.34)

Nguyen, Anh Duc, Phi, Hieu Minh, Ngo, Anh Viet, Trieu, Long Hai, Nguyen, Thai Phuong

Investigating Recent Large Language Models for Vietnamese Machine Reading Comprehension

arXiv.org Artificial Intelligence, Mar-23-2025

Large Language Models (LLMs) have shown remarkable proficiency in Machine Reading Comprehension (MRC) tasks; however, their effectiveness for low-resource languages like Vietnamese remains largely unexplored. In this paper, we fine-tune and evaluate two state-of-the-art LLMs: Llama 3 (8B parameters) and Gemma (7B parameters), on ViMMRC, a Vietnamese MRC dataset. By utilizing Quantized Low-Rank Adaptation (QLoRA), we efficiently fine-tune these models and compare their performance against powerful LLM-based baselines. Although our fine-tuned models are smaller than GPT-3 and GPT-3.5, they outperform both traditional BERT-based approaches and these larger models. This demonstrates the effectiveness of our fine-tuning process, showcasing how modern LLMs can surpass the capabilities of older models like BERT while still being suitable for deployment in resource-constrained environments. Through intensive analyses, we explore various aspects of model performance, providing valuable insights into adapting LLMs for low-resource languages like Vietnamese. Our study contributes to the advancement of natural language processing in low-resource languages, and we make our fine-tuned models publicly available at: https://huggingface.co/iaiuet.
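The efficiency claim for low-rank adaptation can be made concrete with a little arithmetic. The sketch below counts trainable parameters for one weight matrix when it is updated directly versus through a rank-r pair of factors, which is the low-rank part of QLoRA (QLoRA additionally keeps the frozen base weights quantized, which this sketch does not model). The 4096 x 4096 shape and rank 16 are illustrative assumptions, not values taken from the paper.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # training the d_out x d_in matrix directly
    lora = rank * (d_out + d_in)   # factor B is d_out x rank, factor A is rank x d_in
    return full, lora


# Hypothetical layer shape and rank, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, 16)
ratio = lora / full
print(f"full={full} lora={lora} ratio={ratio:.4f}")
```

Under these assumed sizes the adapter trains under one percent of the parameters in the matrix it adapts, which is why the approach suits resource-constrained settings.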

large language model, llama 3, machine learning

2503.18062

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education > Assessment & Standards > Student Performance (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FOX News, Mar-22-2025, 14:01:24 GMT

Texas private school's use of new 'AI tutor' rockets student test scores to top 2% in the country

Alpha School co-founder Mackenzie Price and a junior at the school, Elle Kristine, join 'Fox & Friends' to discuss the benefits of incorporating artificial intelligence into the classroom. A Texas private school is seeing student test scores soar to new heights following the implementation of an artificial intelligence (AI) "tutor." At Alpha School in Austin, Texas, students are placed in the classroom for two hours a day with an AI assistant, using the rest of the day to focus on skills like public speaking, financial literacy, and teamwork. "We use an AI tutor and adaptive apps to provide a completely personalized learning experience for all of our students, and as a result our students are learning faster, they're learning way better. In fact, our classes are in the top 2% in the country," Alpha School co-founder Mackenzie Price told "Fox & Friends." Will A.I. make schools 'obsolete,' or does it present a new 'opportunity' for the education system?

alpha school, artificial intelligence, texas private school

Country: North America > United States > Texas > Travis County > Austin (0.26)

Industry:

Education > Educational Setting (1.00)
Education > Assessment & Standards > Student Performance (0.62)
Education > Educational Technology > Educational Software > Computer Based Training (0.38)

Technology: Information Technology > Artificial Intelligence (1.00)

Qwaider, Chatrine, Alhafni, Bashar, Chirkunov, Kirill, Habash, Nizar, Briscoe, Ted

Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection

arXiv.org Artificial Intelligence, Mar-22-2025

Automated Essay Scoring (AES) plays a crucial role in assessing language learners' writing quality, reducing grading workload, and providing real-time feedback. Arabic AES systems are particularly challenged by the lack of annotated essay datasets. This paper presents a novel framework leveraging Large Language Models (LLMs) and Transformers to generate synthetic Arabic essay datasets for AES. We prompt an LLM to generate essays across CEFR proficiency levels and introduce controlled error injection using a fine-tuned Standard Arabic BERT model for error type prediction. Our approach produces realistic human-like essays, contributing a dataset of 3,040 annotated essays. Additionally, we develop a BERT-based auto-marking system for accurate and scalable Arabic essay evaluation. Experimental results demonstrate the effectiveness of our framework in improving Arabic AES performance.

large language model, machine learning, natural language

2503.17739

Country:

Asia > Thailand (0.14)
Europe > Ukraine (0.14)
Europe > Germany (0.14)
Europe > France (0.14)

Genre:

Overview (0.93)
Research Report > New Finding (0.48)
Personal > Interview (0.46)

Industry:

Education > Assessment & Standards > Student Performance (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.71)
Education > Educational Technology > Educational Software > Computer Based Training (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Neural Information Processing Systems, Mar-19-2025, 05:00:51 GMT

Efficient multi-prompt evaluation of LLMs, Lucas Weber

Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the reproducibility of results on leaderboards. Many recent works empirically verify prompt sensitivity and advocate for changes in LLM evaluation. In this paper, we consider the problem of estimating the performance distribution across many prompt variants instead of finding a single prompt to evaluate with. We introduce PromptEval, a method for estimating performance across a large set of prompts borrowing strength across prompts and examples to produce accurate estimates under practical evaluation budgets. The resulting distribution can be used to obtain performance quantiles to construct various robust performance metrics (e.g., top 95% quantile or median). We prove that PromptEval consistently estimates the performance distribution and demonstrate its efficacy empirically on three prominent LLM benchmarks: MMLU, BIG-bench Hard, and LMentry; for example, PromptEval can accurately estimate performance quantiles across 100 prompt templates on MMLU with a budget equivalent to two single-prompt evaluations. Moreover, we show how PromptEval can be useful in LLM-as-a-judge and best prompt identification applications.
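To illustrate what "performance quantiles" across prompt templates look like, here is a minimal sketch that summarises a set of per-prompt accuracies with a low quantile, the median, and a high quantile. It assumes every prompt variant has already been fully evaluated; PromptEval's contribution is estimating this distribution from a much smaller budget, and that estimation step is not reproduced here. The accuracy values and the function name `robust_metrics` are made up for illustration.

```python
import statistics


def robust_metrics(accuracies: list[float]) -> dict[str, float]:
    """Summarise a per-prompt accuracy distribution with three quantiles."""
    # With n=20, statistics.quantiles returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(accuracies, n=20, method="inclusive")
    return {
        "q05": cuts[0],
        "median": statistics.median(accuracies),
        "q95": cuts[-1],
    }


# Hypothetical accuracies of one model under ten prompt templates.
per_prompt_accuracy = [0.61, 0.58, 0.66, 0.70, 0.55, 0.63, 0.68, 0.59, 0.64, 0.62]
metrics = robust_metrics(per_prompt_accuracy)
print(metrics)
```

Reporting the low quantile alongside the median shows how much a leaderboard score could move purely because of the prompt template chosen.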

large language model, machine learning, natural language

Country: North America > United States (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology (0.67)
Education > Assessment & Standards (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Mar-19-2025, 01:20:26 GMT

Generating Correct Answers for Progressive Matrices Intelligence Tests

Raven's Progressive Matrices are multiple-choice intelligence tests, where one tries to complete the missing location in a 3×3 grid of abstract images. Previous attempts to address this test have focused solely on selecting the right answer out of the multiple choices. In this work, we focus, instead, on generating a correct answer given the grid, without seeing the choices, which is a harder task, by definition. The proposed neural model combines multiple advances in generative models, including employing multiple pathways through the same network, using the reparameterization trick along two pathways to make their encoding compatible, a dynamic application of variational losses, and a complex perceptual loss that is coupled with a selective backpropagation procedure. Our algorithm is able not only to generate a set of plausible answers, but also to be competitive with state-of-the-art methods in multiple-choice tests.

artificial intelligence, contrast, machine learning

Genre: Research Report (0.48)

Industry: Education > Assessment & Standards > Measuring Intelligence (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Malnatsky, Elena, Wang, Shenghui, Hindriks, Koen V., Ligthart, Mike E. U.

Dialogic Learning in Child-Robot Interaction: A Hybrid Approach to Personalized Educational Content Generation

arXiv.org Artificial Intelligence, Mar-19-2025

Dialogic learning fosters motivation and deeper understanding in education through purposeful and structured dialogues. Foundational models offer a transformative potential for child-robot interactions, enabling the design of personalized, engaging, and scalable interactions. However, their integration into educational contexts presents challenges in terms of ensuring age-appropriate and safe content and alignment with pedagogical goals. We introduce a hybrid approach to designing personalized educational dialogues in child-robot interactions. By combining rule-based systems with LLMs for selective offline content generation and human validation, the framework ensures educational quality and developmental appropriateness. We illustrate this approach through a project aimed at enhancing reading motivation, in which a robot facilitated book-related dialogues.

large language model, machine learning, natural language

2503.15762

Country:

Europe (0.69)
North America > United States (0.46)
Asia > Japan > Hokkaidō (0.14)

Genre: Research Report (0.64)

Industry: Education > Assessment & Standards (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)

Neural Information Processing Systems, Mar-17-2025, 21:28:04 GMT

An Autoencoder-Like Nonnegative Matrix Co-Factorization for Improved Student Cognitive Modeling

Student cognitive modeling (SCM) is a fundamental task in intelligent education, with applications ranging from personalized learning to educational resource allocation. By exploiting students' response logs, SCM aims to predict their exercise performance as well as estimate knowledge proficiency in a subject. Data mining approaches such as matrix factorization can obtain high accuracy in predicting student performance on exercises, but the knowledge proficiency is unknown or poorly estimated. The situation is further exacerbated if only sparse interactions exist between exercises and students (or knowledge concepts). To solve this dilemma, we root monotonicity (a fundamental psychometric theory on educational assessments) in a co-factorization framework and present an autoencoder-like nonnegative matrix co-factorization (AE-NMCF), which improves the accuracy of estimating the student's knowledge proficiency via an encoder-decoder learning pipeline.

autoencoder-like nonnegative matrix co-factorization, machine learning, simulation of human behavior

Industry: Education > Assessment & Standards (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.81)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.42)

arXiv.org Artificial Intelligence, Mar-17-2025

Rendering Transparency to Ranking in Educational Assessment via Bayesian Comparative Judgement

Gray, Andy, Rahat, Alma, Lindsay, Stephen, Pearson, Jen, Crick, Tom

Ensuring transparency in educational assessment is increasingly critical, particularly post-pandemic, as demand grows for fairer and more reliable evaluation methods. Comparative Judgement (CJ) offers a promising alternative to traditional assessments, yet concerns remain about its perceived opacity. This paper examines how Bayesian Comparative Judgement (BCJ) enhances transparency by integrating prior information into the judgement process, providing a structured, data-driven approach that improves interpretability and accountability. BCJ assigns probabilities to judgement outcomes, offering quantifiable measures of uncertainty and deeper insights into decision confidence. By systematically tracking how prior data and successive judgements inform final rankings, BCJ clarifies the assessment process and helps identify assessor disagreements. Multi-criteria BCJ extends this by evaluating multiple learning outcomes (LOs) independently, preserving the richness of CJ while producing transparent, granular rankings aligned with specific assessment goals. It also enables a holistic ranking derived from individual LOs, ensuring comprehensive evaluations without compromising detailed feedback. Using a real higher education dataset with professional markers in the UK, we demonstrate BCJ's quantitative rigour and ability to clarify ranking rationales. Through qualitative analysis and discussions with experienced CJ practitioners, we explore its effectiveness in contexts where transparency is crucial, such as high-stakes national assessments. We highlight the benefits and limitations of BCJ, offering insights into its real-world application across various educational settings.
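The idea of assigning probabilities to judgement outcomes and tracking how prior data and successive judgements shape the result can be sketched with a simple Beta-Bernoulli model for a single pair of scripts. This is an assumption made for illustration: the abstract does not spell out the model, and the paper's formulation may differ. The class name `PairBelief` and the sequence of judgements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PairBelief:
    """Beta belief over the probability that script A is preferred to script B."""
    alpha: float = 1.0  # prior pseudo-count for "A preferred" (assumed uniform prior)
    beta: float = 1.0   # prior pseudo-count for "B preferred"

    def update(self, a_preferred: bool) -> None:
        # Each judgement adds one count to the side the assessor chose.
        if a_preferred:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self) -> float:
        # Posterior probability that A is the stronger script.
        return self.alpha / (self.alpha + self.beta)

    def variance(self) -> float:
        # Posterior variance: a quantifiable measure of remaining uncertainty.
        n = self.alpha + self.beta
        return (self.alpha * self.beta) / (n * n * (n + 1))


belief = PairBelief()
for outcome in [True, True, False, True]:  # hypothetical assessor judgements
    belief.update(outcome)
print(belief.mean(), belief.variance())
```

Because every judgement changes the counts in a visible way, the path from prior to final belief can be inspected, and a pair whose variance stays high flags disagreement between assessors.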

assessment, machine learning, natural language

2503.15549

Country: Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)
Personal > Interview (0.67)

Industry:

Education > Educational Setting (1.00)
Education > Assessment & Standards (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)